[Docs] Add user guide for configuring authentication for Ray clusters #49080

Merged: 3 commits, Dec 13, 2024

andrewsykim
Contributor

Why are these changes needed?

Add a user guide for configuring authentication and access control for Ray clusters using KubeRay and Kubernetes RBAC.

Related issue number

Based on REP: https://github.com/ray-project/enhancements/blob/main/reps/2024-05-21-kuberay-authentication.md

Checks

  • I've signed off every commit(by using the -s flag, i.e., git commit -s) in this PR.
  • I've run scripts/format.sh to lint the changes in this PR.
  • I've included any doc changes needed for https://docs.ray.io/en/master/.
    • I've added any new APIs to the API Reference. For example, if I added a
      method in Tune, I've added it in doc/source/tune/api/ under the
      corresponding .rst file.
  • I've made sure the tests are passing. Note that there might be a few flaky tests, see the recent failures at https://flakey-tests.ray.io/
  • Testing Strategy
    • Unit tests
    • Release tests
    • This PR is not tested :(

@kevin85421 kevin85421 self-assigned this Dec 4, 2024
@Moonquakes

Hi @andrewsykim , I found that the protection in the document is mainly for submitting ray jobs through port 8265. Is there a corresponding protection mechanism and document for interactive access to Ray Cluster through port 10001? (It seems that the ray.init interface needs to be modified, but I am not sure if it is currently supported)

@andrewsykim
Contributor Author

@Moonquakes that's correct. For now we are starting with authentication for the Ray dashboard only; in the future we will consider the client port. @kevin85421 suggested we stay away from recommending the interactive client, but I can't recall the reasons.
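
To make the scope concrete: the guide protects HTTP traffic to the dashboard port (8265), so a client of that port has to attach the bearer token to each request. The following is a minimal sketch of that idea, not code from the PR. It assumes the dashboard is port-forwarded to `localhost:8265`, that the token comes from `kubectl create token ray-user`, and that the job list lives at `/api/jobs/`; the helper names are hypothetical.

```python
import json
import urllib.request


def bearer_headers(token: str) -> dict:
    """Return the HTTP headers that carry a Kubernetes bearer token."""
    return {"Authorization": f"Bearer {token}"}


def ray_job_headers_env(token: str) -> str:
    """Serialize the headers as the JSON string stored in RAY_JOB_HEADERS."""
    return json.dumps(bearer_headers(token))


def build_jobs_request(
    token: str, dashboard_url: str = "http://localhost:8265"
) -> urllib.request.Request:
    """Build (but do not send) an authenticated GET for the dashboard job list.

    The /api/jobs/ path is an assumption made for illustration.
    """
    url = dashboard_url.rstrip("/") + "/api/jobs/"
    return urllib.request.Request(url, headers=bearer_headers(token), method="GET")
```

Nothing here touches port 10001; the Ray client port stays outside what this guide covers.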

Member

@kevin85421 kevin85421 left a comment

Nice!

@@ -27,6 +27,7 @@ user-guides/tls
user-guides/k8s-autoscaler
user-guides/static-ray-cluster-without-kuberay
user-guides/kubectl-plugin
user-guides/kuberay-auth
Member

Also, add:

* {ref}`kuberay-auth`

below. Currently, the page only appears in the sidebar.

Contributor Author

done


## Prerequisites

* A Kubernetes cluster (this guide uses GKE, but the concepts apply to other Kubernetes distributions).
Member

Suggested change
* A Kubernetes cluster (this guide uses GKE, but the concepts apply to other Kubernetes distributions).
* A Kubernetes cluster (this guide uses GKE, but the concepts apply to other Kubernetes distributions).

## Prerequisites

* A Kubernetes cluster (this guide uses GKE, but the concepts apply to other Kubernetes distributions).
* `kubectl` installed and configured to interact with your cluster.
Member

Suggested change
* `kubectl` installed and configured to interact with your cluster.
* `kubectl` installed and configured to interact with your cluster.


* A Kubernetes cluster (this guide uses GKE, but the concepts apply to other Kubernetes distributions).
* `kubectl` installed and configured to interact with your cluster.
* `gcloud` CLI installed and configured (if using GKE).
Member

Suggested change
* `gcloud` CLI installed and configured (if using GKE).
* `gcloud` CLI installed and configured (if using GKE).

* A Kubernetes cluster (this guide uses GKE, but the concepts apply to other Kubernetes distributions).
* `kubectl` installed and configured to interact with your cluster.
* `gcloud` CLI installed and configured (if using GKE).
* Helm installed.
Member

Suggested change
* Helm installed.
* [Helm](https://helm.sh/) installed.

* `kubectl` installed and configured to interact with your cluster.
* `gcloud` CLI installed and configured (if using GKE).
* Helm installed.
* Ray installed locally.
Member

Suggested change
* Ray installed locally.
* Ray installed locally.
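
One prerequisite detail comes up later in this review: `kubectl create token` is only available in kubectl clients at version 1.24 or newer, and client binaries are often older than the cluster. The sketch below shows one way to check the local client before following the guide. It assumes `kubectl` is on `PATH` and that `kubectl version --client -o json` reports a `clientVersion.gitVersion` field; the function names are hypothetical.

```python
import json
import re
import subprocess


def parse_major_minor(git_version: str) -> tuple:
    """Extract (major, minor) from a version string such as 'v1.23.17'."""
    match = re.match(r"v?(\d+)\.(\d+)", git_version)
    if match is None:
        raise ValueError(f"unrecognized version string: {git_version!r}")
    return int(match.group(1)), int(match.group(2))


def supports_create_token(git_version: str) -> bool:
    """Return True if this kubectl client version is 1.24 or newer."""
    return parse_major_minor(git_version) >= (1, 24)


def local_kubectl_version() -> str:
    """Ask the local kubectl binary for its client version string."""
    result = subprocess.run(
        ["kubectl", "version", "--client", "-o", "json"],
        check=True,
        capture_output=True,
        text=True,
    )
    return json.loads(result.stdout)["clientVersion"]["gitVersion"]
```

Calling `supports_create_token(local_kubectl_version())` before the token step surfaces the version problem early. On an older client, upgrading kubectl is the straightforward fix.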

Get a token for the `ray-user` service account and store it in the `RAY_JOB_HEADERS` environment variable:

```bash
export RAY_JOB_HEADERS="{\"Authorization\": \"Bearer $(kubectl create token ray-user)\"}"
Member

Can you add a note that `kubectl create token` is only supported in kubectl versions >= 1.24 (ref), or update the "Prerequisites" section?

Contributor Author

@andrewsykim andrewsykim Dec 9, 2024

1.24 has been EOL for a while now, it's probably safe to assume most users are using a newer version at this point?

Member

I followed this documentation and realized that I am using version 1.23.X on the client side. On the server side, I think most people use newer Kubernetes versions, but on the client side, users might not upgrade as frequently.

Contributor Author

good point, I'll add a note about this

Contributor Author

done


# Configure Ray Clusters with Authentication and Access Control using KubeRay

This guide demonstrates how to secure Ray clusters deployed with KubeRay by enabling authentication and access control using Kubernetes Role-Based Access Control (RBAC).
Member

Can you add the current limitations to the documentation? For example, it currently doesn't support RayJob and RayService. How can I use the browser to view the dashboard? (follow up)

Contributor Author

@andrewsykim andrewsykim Dec 9, 2024

ack, I can also add some instructions for accessing the dashboard via browser

Contributor Author

added instructions for using browser to access Ray dashboard

@kevin85421
Member

@Moonquakes, yeah, the Ray maintainers currently recommend that users avoid using the Ray client. It has not been actively maintained for years due to some fundamental stability issues. We have been discussing Ray Client V2, but it is still in the planning stage for now.

@andrewsykim andrewsykim force-pushed the kuberay-auth branch 3 times, most recently from 8a2029f to 2b643b7 Compare December 9, 2024 20:58

This guide demonstrates how to secure Ray clusters deployed with KubeRay by enabling authentication and access control using Kubernetes Role-Based Access Control (RBAC).

> **Note:** This guide is only supported for the RayCluster custom resource.
Member

How long does the token last before it expires? I followed the documentation before lunch, but after lunch, the token stopped working, and I had to create a new one. Not necessary to address this in this PR.

Contributor Author

I believe tokens last 1 hour by default. You can specify the duration, though:

`kubectl create token ray-user --duration=24h`

The server may return an expiration shorter than what the client requests, though.

Member

Would you mind adding some comments regarding the expiration? Thanks!

Contributor Author

Updated the example to include --duration so users know the flag exists

@kevin85421
Member

I will ping our doc team to review this PR.

@angelinalg
Contributor

Thank you for contributing to Ray docs. Just some style nits to be more consistent with our style guide. Thank you.

@andrewsykim
Contributor Author

@angelinalg I didn't see any comments for style nits, did you forget to submit the review?

@@ -0,0 +1,207 @@
(kuberay-auth)=

# Configure Ray Clusters with Authentication and Access Control using KubeRay
Contributor

Suggested change
# Configure Ray Clusters with Authentication and Access Control using KubeRay
# Configure Ray clusters with authentication and access control using KubeRay

Contributor

Trying to standardize on sentence case for titles.


## Prerequisites

* A Kubernetes cluster (this guide uses GKE, but the concepts apply to other Kubernetes distributions).
Contributor

Suggested change
* A Kubernetes cluster (this guide uses GKE, but the concepts apply to other Kubernetes distributions).
* A Kubernetes cluster. This guide uses GKE, but the concepts apply to other Kubernetes distributions.

Contributor

Avoiding excessive use of parentheses.


* A Kubernetes cluster (this guide uses GKE, but the concepts apply to other Kubernetes distributions).
* `kubectl` installed and configured to interact with your cluster.
* `gcloud` CLI installed and configured (if using GKE).
Contributor

Suggested change
* `gcloud` CLI installed and configured (if using GKE).
* `gcloud` CLI installed and configured, if using GKE.

* [Helm](https://helm.sh/) installed.
* Ray installed locally.

## Create a GKE Cluster (or use an existing cluster)
Contributor

Suggested change
## Create a GKE Cluster (or use an existing cluster)
## Create or use an existing GKE Cluster


## Create a GKE Cluster (or use an existing cluster)

If you don't have a Kubernetes cluster, create one using the following command (or adapt it for your cloud provider):
Contributor

Suggested change
If you don't have a Kubernetes cluster, create one using the following command (or adapt it for your cloud provider):
If you don't have a Kubernetes cluster, create one using the following command, or adapt it for your cloud provider:


The output should be `yes`.

## Submit a Ray Job with Authentication
Contributor

Suggested change
## Submit a Ray Job with Authentication
## Submit a Ray job with authentication

ray job submit --address http://localhost:8265 -- python -c "import ray; ray.init(); print(ray.cluster_resources())"
```
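
If this command is rejected after working earlier, the token stored in `RAY_JOB_HEADERS` may have expired. As noted in this review, tokens from `kubectl create token` last about an hour by default, and the server may grant less than `--duration` requests. The following is a minimal sketch for reading the expiry. It assumes the token is a standard three-part JWT with an `exp` claim, which is the case for Kubernetes service account tokens. It only decodes the payload and does not verify the signature; the function names are hypothetical.

```python
import base64
import json
from datetime import datetime, timezone
from typing import Optional


def jwt_claims(token: str) -> dict:
    """Decode the (unverified) payload of a JWT into a dict of claims."""
    parts = token.split(".")
    if len(parts) != 3:
        raise ValueError("expected a JWT with three dot-separated parts")
    payload = parts[1]
    # base64url encoding drops '=' padding; restore it before decoding.
    payload += "=" * (-len(payload) % 4)
    return json.loads(base64.urlsafe_b64decode(payload))


def token_expiry(token: str) -> datetime:
    """Return the token's expiration time (the 'exp' claim) in UTC."""
    return datetime.fromtimestamp(jwt_claims(token)["exp"], tz=timezone.utc)


def is_expired(token: str, now: Optional[datetime] = None) -> bool:
    """Return True if the token's expiration time has passed."""
    current = now or datetime.now(timezone.utc)
    return current >= token_expiry(token)
```

Checking `token_expiry` on a freshly created token also shows whether the server honored the requested duration.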

The job should now succeed, and you'll see output similar to this:
Contributor

Suggested change
The job should now succeed, and you'll see output similar to this:
The job should now succeed, and you should see output similar to this:


## Verify Access Using Cloud IAM (Optional)

Most cloud providers allow you to authenticate to your Kubernetes cluster as your cloud IAM user. This is a convenient way to interact with the cluster without managing separate Kubernetes credentials.
Contributor

Suggested change
Most cloud providers allow you to authenticate to your Kubernetes cluster as your cloud IAM user. This is a convenient way to interact with the cluster without managing separate Kubernetes credentials.
Most cloud providers allow you to authenticate to the Kubernetes cluster as your cloud IAM user. This method is a convenient way to interact with the cluster without managing separate Kubernetes credentials.

------------------------------------------
```

## Verify Access Using Cloud IAM (Optional)
Contributor

Suggested change
## Verify Access Using Cloud IAM (Optional)
## Verify access using cloud IAM (Optional)

ray job submit --address http://localhost:8265 -- python -c "import ray; ray.init(); print(ray.cluster_resources())"
```

The job should succeed if your cloud user has the necessary Kubernetes RBAC permissions (you may need to configure additional RBAC rules for your cloud user).
Contributor

Suggested change
The job should succeed if your cloud user has the necessary Kubernetes RBAC permissions (you may need to configure additional RBAC rules for your cloud user).
The job should succeed if your cloud user has the necessary Kubernetes RBAC permissions. You may need to configure additional RBAC rules for your cloud user.

@angelinalg
Contributor

angelinalg commented Dec 12, 2024

So sorry for the oversight, @andrewsykim!
And BTW, hello! Hope you're doing well.

Signed-off-by: Andrew Sy Kim <[email protected]>
@andrewsykim
Contributor Author

andrewsykim commented Dec 12, 2024

Thanks @angelinalg! I'm doing great, hope you're doing well too! :)

@Moonquakes

> @Moonquakes, yeah, the Ray maintainers currently recommend that users avoid using the Ray client. It has not been actively maintained for years due to some fundamental stability issues. We have been discussing Ray Client V2, but it is still in the planning stage for now.

Thanks for your answer! However, Ray Client will be very helpful in the scenario of using Jupyter, which can greatly improve development efficiency. I look forward to the release of Ray Client v2!

@kevin85421
Member

> Ray Client will be very helpful in the scenario of using Jupyter, which can greatly improve development efficiency.

Agree. A good Ray client provides better UX compared with `ray job submit`.

@kevin85421 kevin85421 added the `go` label (add ONLY when ready to merge, run all tests) Dec 13, 2024
@jjyao jjyao merged commit 7626673 into ray-project:master Dec 13, 2024
6 checks passed
ujjawal-khare pushed a commit to ujjawal-khare-27/ray that referenced this pull request Dec 17, 2024
5 participants